A new algorithm for mining frequent connected subgraphs based on adjacency matrices

Authors

  • Andrés Gago Alonso
  • Abel Puentes-Luberta
  • Jesús Ariel Carrasco-Ochoa
  • José Eladio Medina-Pagola
  • José Francisco Martínez Trinidad
Abstract

Most Frequent Connected Subgraph Mining (FCSM) algorithms focus on detecting duplicate candidates using canonical form (CF) tests. CF tests have high computational complexity, which hurts the efficiency of graph miners. In this paper, we introduce novel properties of canonical adjacency matrices that reduce the number of CF tests in FCSM. Based on these properties, we propose a new algorithm for frequent connected subgraph mining called grCAM. Experiments on real-world datasets show the impact of the proposed properties on FCSM. The performance of our algorithm is also compared against other reported algorithms.
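
To make the cost of a CF test concrete, the following is a minimal brute-force sketch in Python. It is not grCAM and does not use the properties proposed in the paper. The encoding (lower triangle of the adjacency matrix flattened row by row, with the lexicographically largest code taken as canonical) and all names (`make_edges`, `matrix_code`, `canonical_code`, `is_canonical`) are assumptions made for illustration only.

```python
from itertools import permutations


def make_edges(edge_list):
    """Hypothetical helper: map each undirected edge {u, v} to its label."""
    return {frozenset((u, v)): label for u, v, label in edge_list}


def matrix_code(vertex_labels, edges, order):
    """Flatten the lower triangle (diagonal included) of the adjacency
    matrix obtained when the vertices are listed in the given order."""
    code = []
    for i, u in enumerate(order):
        for v in order[:i]:
            # Edge label, or "0" when u and v are not adjacent.
            code.append(edges.get(frozenset((u, v)), "0"))
        code.append(vertex_labels[u])  # diagonal entry holds the vertex label
    return tuple(code)


def canonical_code(vertex_labels, edges):
    """Assumed canonical form: the largest code over all vertex orderings.
    This enumerates n! permutations, which is why CF tests are expensive."""
    n = len(vertex_labels)
    return max(matrix_code(vertex_labels, edges, p)
               for p in permutations(range(n)))


def is_canonical(vertex_labels, edges):
    """CF test: is the matrix in the current vertex order the canonical one?"""
    identity = tuple(range(len(vertex_labels)))
    return (matrix_code(vertex_labels, edges, identity)
            == canonical_code(vertex_labels, edges))


# Two differently numbered copies of the same labeled path graph.
g1 = (["C", "C", "O"], make_edges([(0, 1, "s"), (1, 2, "d")]))
g2 = (["O", "C", "C"], make_edges([(0, 1, "d"), (1, 2, "s")]))
same_pattern = canonical_code(*g1) == canonical_code(*g2)
```

Two candidates represent the same subgraph exactly when their canonical codes match, so a miner can discard any candidate whose matrix fails `is_canonical`. Since this naive test grows factorially with the number of vertices, reducing how often it runs is the kind of saving the abstract describes.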

Similar resources

Using a Hash-Based Method for Apriori-Based Graph Mining

The problem of discovering frequent subgraphs of graph data can be solved by constructing a candidate set of subgraphs first, and then, identifying within this candidate set those subgraphs that meet the frequent subgraph requirement. In Apriori-based graph mining, to determine candidate subgraphs from a huge number of generated adjacency matrices is usually the dominating factor for the overal...

A Closed Frequent Subgraph Mining Algorithm in Unique Edge Label Graphs

Problems such as closed frequent subset mining, itemset mining, and connected tree mining can be solved with polynomial delay. However, mining closed frequent connected subgraphs requires exponential time. In this paper, we present ECE-CloseSG, an algorithm for finding closed frequent unique edge label subgraphs. ECE-CloseSG uses a search space pruning and ap...

Frequent approximate subgraphs as features for graph-based image classification

The use of approximate graph matching for frequent subgraph mining has been identified in different applications as a need. To meet this need, several algorithms have been developed, but there are applications where it has not been used yet, for example image classification. In this paper, a new algorithm for mining frequent connected subgraphs over undirected and labeled graph collections VEAM...

gSpan: Graph-Based Substructure Pattern Mining

We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based Substructure pattern mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic ...

Large Scale Graph Representations for Subgraph Census

A Subgraph Census (determining the frequency of smaller subgraphs in a network) is an important computational task at the heart of several graph mining algorithms. Recently, several efficient algorithms have been described. We focus on the g-tries, a data structure that encapsulates the topology of the smaller subgraphs in order to speed up the overall computation. Its algorithm makes extensive...

Journal:
  • Intell. Data Anal.

Volume: 14

Publication date: 2010